Text Analysis Meets Computational Lexicography
Similar resources
Data Mining Meets Collocations Discovery
In this paper we discuss the problem of discovering interesting word sequences in the light of two traditions: sequential pattern mining (from data mining) and collocations discovery (from computational linguistics). Smadja (1993) defines a collocation as “a recurrent combination of words that cooccur more often than chance and that correspond to arbitrary word usages.” The notion of arbitrarin...
A Heuristic Algorithm for Nonlinear Lexicography Goal Programming with an Efficient Initial Solution
In this paper, a heuristic algorithm is proposed to solve a nonlinear lexicography goal programming (NLGP) problem using an efficient initial point. Numerical experiments showed that the search quality of the proposed heuristic on a multiple-objective problem depends on the features of the initial point, so in the proposed approach the initial point is retrieved by Data Envelopment Analysis...
Book Review: A Way with Words: Recent Advances in Lexical Theory and Analysis: A Festschrift for Patrick Hanks edited by Gilles-Maurice de Schryver
In his introduction to this collection of articles dedicated to Patrick Hanks, de Schryver presents a quote from Atkins referring to Hanks as “the ideal lexicographer’s lexicographer.” Indeed, Hanks has had a formidable career in lexicography, including playing an editorial role in the production of four major English dictionaries. But Hanks’s achievements reach far beyond lexicography; in part...
Lexicalised compositionality
In this paper, we propose an approach to distributional semantics which can be formally related to a simple model-theoretic approach. We describe treatments of some of the traditional lexical semantic relationships within this framework, and also outline accounts of some phenomena which have been considered within Generative Lexicon theory. We further argue that distributions should be based on...
Towards an integrated representation of multiple layers of linguistic annotation in multilingual corpora
There has been an increasing interest in recent years in the enrichment of natural language corpora in terms of annotation with explicit linguistic information. This interest manifests itself most prominently in two areas of linguistics: corpus linguistics and computational linguistics. For corpus linguistics, the long-standing practice has been to work on raw, i.e., unannotated text. While raw...